Feature significance for multivariate kernel density estimation

نویسندگان

  • Tarn Duong
  • Arianna Cowling
  • Inge Koch
  • Matthew P. Wand
چکیده

Multivariate kernel density estimation provides information about structure in data. Feature significance is a technique for deciding whether features – such as local extrema – are statistically significant. This paper proposes a framework for feature significance in d-dimensional data which combines kernel density derivative estimators and hypothesis tests for modal regions. For the gradient and curvature estimators distributional properties are given, and pointwise test statistics are derived. The hypothesis tests extend the two-dimensional feature significance ideas of Godtliebsen et al. (2002). The theoretical framework is complemented by novel visualisation for three-dimensional data. Applications to real data sets show that tests based on the kernel curvature estimators perform well in identifying modal regions. These results can be enhanced by corresponding tests with kernel gradient estimators.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Ridge-Penalty Regularization for Kernel-CCA

CCA and Kernel-CCA are powerful statistical tools that have been successfully employed for feature extraction. However, when working in high-dimensional signal spaces, care has to be taken to avoid overfitting. This paper discusses the influence of ridge penalty regularization on kernel-CCA by relating it to multivariate linear regression(MLR) and partial least squares(PLS). Experimental result...

متن کامل

A Bayesian approach to bandwidth selection for multivariate kernel density estimation

Kernel density estimation for multivariate data is an important technique that has a wide range of applications. However, it has received significantly less attention than its univariate counterpart. The lower level of interest in multivariate kernel density estimation is mainly due to the increased difficulty in deriving an optimal data-driven bandwidth as the dimension of the data increases. ...

متن کامل

Multivariate Modality Inference Using Gaussian Kernel

The number of modes (also known as modality) of a kernel density estimator (KDE) draws lots of interests and is important in practice. In this paper, we develop an inference framework on the modality of a KDE under multivariate setting using Gaussian kernel. We applied the modal clustering method proposed by [1] for mode hunting. A test statistic and its asymptotic distribution are derived to a...

متن کامل

Comparison of the Gamma kernel and the orthogonal series methods of density estimation

The standard kernel density estimator suffers from a boundary bias issue for probability density function of distributions on the positive real line. The Gamma kernel estimators and orthogonal series estimators are two alternatives which are free of boundary bias. In this paper, a simulation study is conducted to compare small-sample performance of the Gamma kernel estimators and the orthog...

متن کامل

Bandwidth Selection for Multivariate Kernel Density Estimation Using MCMC

Kernel density estimation for multivariate data is an important technique that has a wide range of applications in econometrics and finance. However, it has received significantly less attention than its univariate counterpart. The lower level of interest in multivariate kernel density estimation is mainly due to the increased difficulty in deriving an optimal datadriven bandwidth as the dimens...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Computational Statistics & Data Analysis

دوره 52  شماره 

صفحات  -

تاریخ انتشار 2008